Abstract — By exploiting multipath fading channels as a source 
of common randomness, physical layer (PHY) based key generation
protocols allow two terminals with correlated observations 
to generate secret keys with information-theoretical security. The 
state of the art, however, still suffers from major limitations, e.g., 
low key generation rate, low entropy of key bits, and a high 
reliance on node mobility. In this paper, a novel cooperative 
key generation protocol is developed to facilitate high-rate key 
generation in narrowband fading channels, where two keying 
nodes extract the phase randomness of the fading channel 
with the aid of relay node(s). For the first time, we explicitly 
consider the effect of estimation methods on the extraction of 
secret key bits from the underlying fading channels and focus 
on a popular statistical method, maximum likelihood estimation 
(MLE). The performance of the cooperative key generation 
scheme is extensively evaluated theoretically. We successfully 
establish both a theoretical upper bound on the maximum secret 
key rate from mutual information of correlated random sources 
and a more practical upper bound from Cramer-Rao bound 
(CRB) in estimation theory. Numerical examples and simulation 
studies are also presented to demonstrate the performance of 
the cooperative key generation system. The results show that the 
key rate can be improved by a couple of orders of magnitude 
compared to the existing approaches. 

Index Terms — Key generation, cooperative networking, multi- 
path channel, single-tone estimation, maximum likelihood esti- 
mation, wireless network. 

I. Introduction 

A fundamental problem of all wireless communications 
is the secure distribution of secret keys, which must 
be generated and shared between authorized parties prior to 
the start of communication. In the field of cryptography, the 
Diffie-Hellman key exchange protocol is one of the most 
basic and widely used cryptographic protocols for secure key 
establishment. The essential idea behind the Diffie-Hellman 
key exchange is that two parties with no prior knowledge 
of each other can jointly establish a shared secret key over 
an insecure communication channel. However, the protocol 
assumes that the adversary has bounded computational power and 
relies on the computational hardness of certain mathematical 
problems to achieve secure key generation. This body of 
cryptographic protocols achieves computational security. 

Recently, the notion of physical layer (PHY) based key 
generation has been proposed and the resulting approaches 
serve as alternative solutions to the key establishment problem 
in wireless networks. Based on the theory of reciprocity 
of antennas and electromagnetic propagation, the channel 
responses between two transceivers can be used as a source 
of common randomness that is not available to adversaries in 
other locations. Such a source of secrecy, which is provided by 
the fading process of wireless channels, can help to achieve 
information-theoretical security. This body of work can be 
traced back to the original information-theoretical formulation 
of secure communication due to [1]. Building on information 
theory and following [1], information theorists characterized 
the fundamental bounds and showed the feasibility of generating
secrets using auxiliary random sources [2], [3], [4]. 
However, these works are almost all theoretical 
and do not present explicit constructions. To the best of our 
knowledge, Hershey et al. proposed the first key generation 
scheme based on differential phase detection in [5]. Using 
multipath channels as the source of common randomness, 
recent research focuses on measuring a popular statistic of the 
wireless channel, i.e., received signal strength (RSS), for 
extracting shared secret bits between node pairs [6], [7], [8]. 
It has been demonstrated that these RSS based methods are 
feasible on customized 802.11 platforms. The state of the art, 
however, still suffers from major limitations. First, the key 
bit generation rate supported by these approaches is very low. 
This is due to the fact that the PHY based key generation relies 
on channel variations or node mobility to extract high entropy 
bits. In the time intervals where channel changes slowly, only a 
limited number of key bits can be extracted. The resulting low 
key rate significantly limits their practical application given the 
intermittent connectivity in mobile environments. To increase 
the key rate, Zeng et al. proposed a key generation protocol 
by exploiting multi-antenna diversity [9]. But it also leads to 
an increase in the complexity of the transceivers. Second, the 
generated raw key bit stream has low randomness. This is 
because the distribution of the RSS measurements or estimates 
is not uniform, which results in unequally likely bits after 
quantization. As cryptographic keys need to be as random as 
possible so that it is infeasible to reproduce them or predict 
them, it is important to ensure high entropy of the generated 
keys. However, the problem of how to safely and efficiently 
generate random key bits using channel randomness is still 
open. 

To overcome the above limitations, in this paper, we in- 
vestigate the problem of cooperative key generation between 
two nodes with the aid of third parties, i.e., relay nodes. The 
introduction of the relay nodes is motivated by the diversity 
gain provided by the relay nodes, which can potentially help 
to increase the key rate by furnishing the two nodes additional 
correlated randomness. To enhance the level of entropy of bit 
sequences, we propose to exploit the uniformly distributed 
channel phase for key generation. Specifically, we develop 
a novel time-slotted cooperative key generation scheme by 
exploiting channel phase randomness under narrowband fading 
channels. For the first time, we explicitly consider the effect 
of estimation methods on the extraction of secret key bits 
from the underlying fading channels and focus on a popular
statistical method, maximum likelihood estimation (MLE). 
The main features of the proposed scheme are: i) The key 
bit generation rate is improved by a couple of orders of 
magnitude compared to RSS based approaches. This is due to 
the high-accuracy MLE and the fact that the random channels 
between the relay and the keying nodes can be effectively 
utilized during a single coherence time. That also implies 
the proposed scheme can even work in a static environment 
where channels change very slowly; ii) The generated bit 
stream is very close to a truly random sequence due to the 
use of uniformly distributed channel phase for bit generation; 
iii) It is robust to relay node compromise attacks since each 
relay node only contributes a small portion of key bits and a 
small number of them can never obtain the complete global 
key bit information even collectively. The performance of the 
cooperative key generation scheme is extensively evaluated 
theoretically. We successfully establish both a theoretical 
upper bound on the maximum secret key rate from mutual 
information of correlated random sources and a more practical 
upper bound from Cramer-Rao bound (CRB) in estimation 
theory. We also show that the cooperative gain in the key 
generation is similar to the beamforming gain in cooperative 
networking, i.e., the resulting gain is linear in the number 
of relay nodes. Numerical examples and simulation studies 
are also presented to demonstrate the performance of the 
cooperative key generation system. The results show that the 
key rate can be improved by a couple of orders of magnitude 
compared to the existing approaches. 

The rest of the paper is organized as follows. Section 
II gives the problem formulation and introduces the wireless fading 
channel model considered in this paper. Section III discusses 
related work. Section IV provides a detailed description of 
our proposed cooperative key generation schemes. Sections V 
and VI present the theoretical performance analysis and simulation
studies, respectively. Section VII provides a security 
discussion of the proposed scheme from both practical and 
theoretical aspects. Finally, Section VIII concludes the paper. 

II. Problem Formulation and Preliminaries 

In this section, we first define the PHY based key genera- 
tion problem in wireless networks and introduce the general 
assumptions made in the existing work [6], [7], [8], [10]. 
These explain why the wireless channel between a transmitter-receiver
pair can be used as a source of common randomness 
for secret generation. Then we discuss the two most common 
channel models and focus on the narrowband fading channel,
which is closely related to the key generation schemes 
developed in this paper. 



A. Problem Formulation 

In a multipath fading wireless environment, the physical 
signals transmitted between a transmitter-receiver pair rapidly 
decorrelate in space, time and frequency. That implies that it is 
very hard for a third party to predict the channel state between 
the legitimate parties, i.e., an eavesdropper at a third location 
(e.g., half a wavelength away) cannot observe the same 
channel response information. This channel uniqueness prop- 
erty of the transmitter-receiver pair offers potential security 
guarantees. Further, the channel reciprocity indicates the avail- 
ability of using common randomness between the transmitter- 
receiver pair: the electromagnetic waves traveling in both 
directions will undergo the same physical perturbations. That 
implies that in a time-division duplex (TDD) system, if the 
transmitter-receiver pair operates on the same frequency in 
both directions, the channel states/channel impulse responses 
observed at two ends will theoretically be the same. Based 
on these two observations, we can see that there exists a 
natural random source in wireless communications for secrecy 
extraction. 

Consider two parties A and B (we term them as keying 
nodes in the following discussion) that want to establish a 
symmetrical key between them in the presence of an eaves- 
dropper E. The keying nodes are assumed to be half-duplex 
in the sense that they cannot transmit and receive signals at 
the same frequency simultaneously. In the first timeslot, A 
transmits a signal $x_A$ to B, and E can also hear this signal 
over the wireless channel. The signals received by B and E 
are: 

$$r_B = h_{AB} x_A + n_B$$
$$r_E = h_{AE} x_A + n_E,$$

where $h_{AB}$ and $h_{AE}$ are the channel gains from A to B and 
A to E, respectively, and $n_B$ and $n_E$ are noises at B and 
E, respectively. In the second timeslot, B transmits a signal 
$x_B$ to A, and E can also hear this signal over the wireless 
channel. The signals received by A and E are: 

$$r_A = h_{BA} x_B + n_A$$
$$r_E = h_{BE} x_B + n_E,$$

where $h_{BA}$ and $h_{BE}$ are the channel gains from B to A and 
B to E, respectively, and $n_A$ and $n_E$ are noises at A and E, 
respectively. The channel from node $i$ to node $j$ is modeled as 
a multipath fading channel with channel impulse response $h_{ij}(t)$. We assume
channel reciprocity in the forward and reverse directions 
during the coherence time such that $h_{ij}(t) = h_{ji}(t)$ and the 
underlying noise in each channel is additive white Gaussian 
noise (AWGN). In wireless communications, coherence time 
is a statistical measure of the time duration over which the 
channel impulse response is essentially invariant, and quanti- 
fies the similarity of the channel response at different times. 
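
The two-timeslot model above can be illustrated with a minimal simulation sketch. It assumes a single flat complex channel gain per link, a unit probe symbol, and illustrative noise levels; the function name `probe_round` and all parameter values are hypothetical and not part of the proposed protocol.

```python
import math
import random


def probe_round(rng, noise_std=0.05):
    """Simulate one TDD probing round between A and B, overheard by E.

    Assumption: each link is a single complex gain that stays constant
    over the round (i.e., the round fits within one coherence time).
    """
    # Reciprocal A<->B gain: h_AB = h_BA (Rayleigh amplitude, uniform phase).
    h_ab = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
    # Independent gain towards the eavesdropper E.
    h_ae = complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)
    x = 1.0  # known probe symbol sent by both A and B

    def awgn():
        return complex(rng.gauss(0, noise_std), rng.gauss(0, noise_std))

    r_b = h_ab * x + awgn()  # timeslot 1: A -> B
    r_a = h_ab * x + awgn()  # timeslot 2: B -> A, same gain by reciprocity
    r_e = h_ae * x + awgn()  # E's observation of A's probe
    return r_a, r_b, r_e


def mean_gaps(num_rounds=500, seed=1):
    """Average |r_A - r_B| and |r_A - r_E| over many independent rounds."""
    rng = random.Random(seed)
    gap_ab = gap_ae = 0.0
    for _ in range(num_rounds):
        r_a, r_b, r_e = probe_round(rng)
        gap_ab += abs(r_a - r_b)
        gap_ae += abs(r_a - r_e)
    return gap_ab / num_rounds, gap_ae / num_rounds
```

Under these assumptions, the observations at A and B differ only by noise, while E's observation is unrelated to theirs, which is the property the key generation relies on.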

The keying nodes A and B compute sufficient statistics of their 
observations $r_B$ and $r_A$, respectively, and generate the secret key based on 
these observations. In our system, we assume there exist N 
relay nodes, which are honest and will help and cooperate 
with the keying nodes A and B to generate secret keys. 
On the other side, the eavesdropper E knows the whole key 
generation protocol and can eavesdrop on all the communications 
between legitimate nodes (i.e., A, B and relay nodes). Based 
on communication theory [11], the signals transmitted between 
A and B and the signals transmitted between A (B) and E, 
which is at least $\lambda/2$ away from the network nodes, experience 
independent fading. As an example, consider a wireless system 
with a 900 MHz carrier frequency. If an eavesdropper E is more 
than 16 cm (about half a wavelength) away from the communicating nodes, it experiences 
independent channel variations such that no useful information 
is revealed to it. Following the same assumptions as most 
key generation schemes [6], [7], [8], [10], we assume that 
the adversary E aims to derive the secret key generated 
between legitimate nodes and further steal the transmitted 
private information in the future. Active attacks in which 
the attacker tampers with the transmissions are orthogonal to our 
research and thus not considered in this paper. 
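
The half-wavelength figure quoted for the 900 MHz example can be checked with a short calculation. This is only a worked arithmetic sketch; the helper name `half_wavelength_m` is hypothetical, and free-space propagation at the speed of light is assumed.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def half_wavelength_m(carrier_hz):
    """Return lambda/2 in meters for a carrier frequency given in Hz."""
    return SPEED_OF_LIGHT_M_PER_S / carrier_hz / 2.0


# 900 MHz carrier: lambda is about 33.3 cm, so lambda/2 is about 16.7 cm.
print(f"{half_wavelength_m(900e6) * 100:.1f} cm")  # prints "16.7 cm"
```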

The above problem can be considered as a key generation 
problem in cooperative wireless networks in the presence of 
an eavesdropper. In this paper, we propose to develop an 
efficient and secure cooperative key generation protocol and 
provide an information-theoretic study on maximum key rate 
using techniques from both information theory and estimation 
theory. The proposed design should satisfy the following 
requirements: i) High key rate. Given the intermittent con- 
nectivity in mobile environments, the key generation scheme 
should have a high key rate; ii) Sound key randomness. As 
cryptographic keys need to be as random as possible so 
that it is infeasible to reproduce them or predict them, the 
resulting key bits should have a high level of entropy. Note 
that the existing schemes usually rely on channel variations 
or node mobility to extract high entropy bits. We propose to 
remove this constraint and establish random keys even in static 
environments. 



B. Narrowband and Wideband Fading Channels 

An important characteristic of a multipath channel is the 
delay spread $\nu$ it causes to the signal [11]. If $\nu$ is large, 
the multipath components are typically resolvable, leading to 
the wideband fading channel, where the resulting probability 
distributions for the gains of multipath channel paths are often 
modeled as log-normal or Nakagami [12]. If $\nu$ is small, the 
multipath components are typically nonresolvable, leading to 
the narrowband fading channel, where the amplitude gain is 
Rayleigh distributed. 

In this paper, we will focus on a narrowband fading system 
for secret key generation. Our approach can also apply to 
wideband fading channels, but as will be shown, it is best suited 
to the narrowband fading channel model. Let the transmitted 
signal be 

$$x(t) = \Re\{u(t) e^{j 2\pi f_c t}\},$$

where $u(t)$ is the complex envelope of $x(t)$ with bandwidth 
$B$ and $f_c$ is its carrier frequency. Assume the equivalent 
lowpass time-varying channel impulse response is 
$h(\tau, t) = \sum_{n=1}^{N(t)} \alpha_n(t) e^{-j\phi_n(t)} \delta(\tau - \tau_n(t))$. 
Then the received signal can be written as 

$$r(t) = x(t) * h(\tau, t) = \Re\left\{\left[\int h(\tau, t)\, u(t - \tau)\, d\tau\right] e^{j 2\pi f_c t}\right\} = \Re\left\{\left[\sum_{n=1}^{N(t)} \alpha_n(t) e^{-j\phi_n(t)} u(t - \tau_n(t))\right] e^{j 2\pi f_c t}\right\}, \quad (1)$$

where $\alpha_n(t)$ is a function of path loss and shadowing while 
$\phi_n(t)$ depends on delay, Doppler, and carrier offset. Typically, 
it is assumed that these two random processes $\alpha_n(t)$ and 
$\phi_n(t)$ are independent. Note that $N(t)$ is the number of resolvable 
multipath components. For narrowband fading channels, each 
term in the sum of Eq. (1) results from nonresolvable multipath 
components. 

Under most delay spread characterizations, $\nu \ll 1/B$ 
implies that the delay associated with the $k$th multipath 
component satisfies $\tau_k \le \nu$ $\forall k$, so $u(t - \tau_k) \approx u(t)$. If $x(t)$ is 
assumed to be an unmodulated carrier (single-tone signal) 
$x(t) = \Re\{e^{j 2\pi f_c t}\} = \cos 2\pi f_c t$, it is narrowband for any 
$\nu$. With these assumptions, the received signal becomes 

$$r(t) = \Re\left\{\left[\sum_{n=1}^{N(t)} \alpha_n(t) e^{-j\phi_n(t)}\right] e^{j 2\pi f_c t}\right\} \quad (2)$$
$$\phantom{r(t)} = r_I(t) \cos 2\pi f_c t - r_Q(t) \sin 2\pi f_c t,$$

where the in-phase and quadrature components are given by 
$r_I(t) = \sum_{n=1}^{N(t)} \alpha_n(t) \cos\phi_n(t)$ and 
$r_Q(t) = -\sum_{n=1}^{N(t)} \alpha_n(t) \sin\phi_n(t)$, respectively. The in-phase and quadrature 
components of a Rayleigh fading process are jointly Gaussian 
random processes. The complex "lowpass" equivalent signal 
for $r(t)$ is given by $r_I(t) + j r_Q(t)$, which has phase 
$\theta = \arctan(r_Q(t)/r_I(t))$, where $\theta$ is uniformly distributed, i.e., 
$\theta \sim U[0, 2\pi]$. So $r_I(t) + j r_Q(t)$ can be written as 
$r_I(t) + j r_Q(t) = |h| e^{j\theta} = |h| \cos\theta + j |h| \sin\theta$, where 
$|h| = \sqrt{r_I(t)^2 + r_Q(t)^2}$. Considering the additive white Gaussian noise 
(AWGN) in the channel, Eq. (2) can be written as 

$$r(t) = |h| \cos\theta \cos 2\pi f_c t - |h| \sin\theta \sin 2\pi f_c t + n(t) \quad (3)$$
$$\phantom{r(t)} = |h| \cos(2\pi f_c t + \theta) + n(t),$$

where $n(t)$ is a Gaussian noise process with one-sided power spectral 
density $N_0$. We will estimate the parameters in $r(t)$ and use 
the uniformly distributed phase of the multipath channel for key 
generation. A list of important notation is shown in Table I. 
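
The claim that the narrowband channel phase is uniformly distributed can be checked empirically with a small Monte Carlo sketch. It assumes a fixed number of nonresolvable paths with independent amplitudes and independent uniform path phases; the function names, the number of paths, and the amplitude range are illustrative assumptions, not values from this paper.

```python
import cmath
import math
import random


def narrowband_gain(rng, num_paths=50):
    """Sum of nonresolvable path contributions alpha_n * exp(-j * phi_n)."""
    total = 0j
    for _ in range(num_paths):
        alpha = rng.uniform(0.5, 1.5) / math.sqrt(num_paths)  # assumed amplitudes
        phi = rng.uniform(0.0, 2.0 * math.pi)                 # uniform path phase
        total += alpha * cmath.exp(-1j * phi)
    return total


def phase_histogram(num_draws=4000, bins=8, seed=0):
    """Empirical distribution of the channel phase theta over [0, 2*pi)."""
    rng = random.Random(seed)
    counts = [0] * bins
    for _ in range(num_draws):
        theta = cmath.phase(narrowband_gain(rng)) % (2.0 * math.pi)
        index = min(int(theta / (2.0 * math.pi) * bins), bins - 1)
        counts[index] += 1
    return [c / num_draws for c in counts]
```

If the phase is uniform, each of the 8 bins should hold roughly 1/8 of the draws, which is what makes the phase attractive for generating equally likely key bits.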

TABLE I
A SUMMARY OF IMPORTANT NOTATION.

Symbol           Description
$P_e$            bit error probability (BER)
                 coherence time
$\nu$            delay spread
$q$              number of quantization intervals
$T$              observation time or beacon duration time
$N_s$            number of samples in the observation time
$N_0$            one-sided power spectral density (PSD)
                 channel gains
$f_s$            sampling rate
$N$              number of relay nodes
                 key rate from mutual information with no relay
                 key rate from CRB with no relay
$R_{co}$         cooperative key rate from mutual information
$R_{co}^{CRB}$   cooperative key rate from CRB

III. Related Work 

PHY based key generation can be traced back to the 
original information-theoretic formulation of secure communication
due to [1]. Building on information theory, [2], [3], [4] 
characterized the fundamental bounds and showed the 
feasibility of generating keys using an external random source, the
channel impulse response. To the best of our knowledge, the 
first key generation scheme suitable for wireless networks was 
proposed in [5]. In [5], the differential phase between two 
frequency tones is encoded for key generation. Error control 
coding techniques are used for enhancing the reliability of 
key generation. Similar to [5], a technique of using random 
phase for extracting secret keys in an OFDM system through 
channel estimation and quantization was recently proposed in 
[13]. This paper characterized the probability of generating the 
same bit vector between two nodes as a function of the 
signal-to-interference-and-noise ratio (SINR) and quantization levels. 

A key generation scheme based on extracting secret bits 
from correlated deep fades was proposed in [6]. It is distinguished
from the aforementioned work by using received 
signal strength (RSS) as the random source via a TDD link 
for the protocol design. Two cryptographic tools, information 
reconciliation and privacy amplification, are used to eliminate 
bit vector discrepancies due to RSS measurement asymmetry. 
The final key agreement is achieved by leaking out minimal
information for error correction and sacrificing a certain 
amount of entropy to generate nearly perfectly random secret 
bits. In [7], the authors proposed two key generation schemes 
based on channel impulse response (CIR) estimation and 
RSS measurements. Different from [6], the two transceivers 
alternately send known probe signals to one another and 
estimate the magnitude of the channel response at successive time 
instants. The excursions in the fading channels are used for 
generating bits and the timing of the excursions is used for key 
reconciliation. The resulting sequences are further filtered and 
quantized using a 1-bit quantizer, which results in a low key 
bit rate. Motivated by observations from quantizing jointly 
Gaussian processes, a more general key generation scheme 
was proposed in [10] that exploits empirical measurements to set 
quantization boundaries. Working on the same RSS 
based approach, [8] evaluated the effectiveness of RSS based 
key extraction in real environments. It has been shown that, 
due to the lack of channel variations, static environments are not 
suitable for establishing secure keys, and node mobility helps 
to generate key bits with high entropy. The most recent work 
[14] proposed an efficient and scalable key generation scheme 
that supports both pairwise and group key establishment. 

Due to noise, interference and other factors in the key generation
process, discrepancies may exist between the generated 
bit streams. Variants of this problem have been extensively 
explored under the names information reconciliation, privacy 
amplification and fuzzy extractors. [15] proposed the first protocol
to solve the information-theoretic key agreement problem 
between two parties that initially possess only correlated weak 
secrets. The key agreement was shown to be theoretically 
feasible when the information that the two bit strings contain 
about each other is more than the information that the eavesdropper
has about them. [16] used error-correcting techniques 
to design a protocol that is computationally efficient for 
different distance metrics. Based on the previous results, [17] 
proposed a protocol that is efficient for both parties and has 
both lower round complexity and lower entropy loss. Recently, 
[18] proposed a two-round key agreement protocol for the 
same settings as [17]. 

IV. The Proposed Solutions 

In this section, we present our cooperative key generation 
algorithms for extracting secret bits from wireless channels. 
The proposed algorithms employ the technique of single-tone 
parameter estimation to estimate the uniformly distributed 
channel phase. When keying nodes A and B alternately 
transmit known single-tone signals to each other, each relay 
node also observes the fading signals transmitted through the 
pairwise links between itself and the keying nodes. Therefore, 
with the aid of relay nodes, the keying nodes A and B can 
potentially increase the key rate using additional randomness 
in the same coherence time interval. 

A. Utilizing a Single Relay 

We first consider the single relay case where one relay node 
acts as a helper to facilitate the key generation between the 
keying nodes A and B. The basic idea is that an unmodulated 
carrier (i.e., a single-tone signal) is transmitted through the 
fading channels back and forth between the keying nodes, and 
the keying nodes perform maximum likelihood estimation 
(MLE) based on their observations. Since each bidirectional 
channel between a pair of nodes is a time-division-duplex 
(TDD) channel, which is reciprocal in both directions, it will 
incur the same total phase shift caused by multipath due to the 
channel reciprocity principle. Generally, the protocol consists 
of two main phases: i) Single-tone phase estimation and 
quantization; ii) Key reconciliation and privacy amplification. 

Before we introduce the cooperative key generation protocol,
we first introduce the fundamental building block, the MLE 
used in single-tone signal parameter estimation. During the 
protocol execution, the keying nodes A, B and relay nodes 
use MLE to estimate the parameters of a single-tone signal 
with a known signal model. Given a certain observation set $Z$ 
and parameter set $\boldsymbol{\alpha}$, the objective of MLE is to find the 
parameter values that maximize the pdf of $Z$. In our application, 
the received signal model can be written as 

$$r(t) = b_0 \cos(\omega_0 t + \theta_0) + n(t), \quad (4)$$

where $\boldsymbol{\alpha} = \{b_0, \omega_0, \theta_0\}$ are the unknown parameters (amplitude,
frequency and phase, respectively) to be estimated. The 
received signal is sampled at a constant sampling 
rate $f_s = 1/T_s$ to produce the discrete-time observation 

$$r[m] = b_0 \cos(\omega_0 (t_0 + m T_s) + \theta_0) + n[m] \quad (5)$$

for $m = 0, 1, \ldots, N_s - 1$. Here, $t_0$ denotes the time of the 
first sample and the $n[m]$ are Gaussian random samples with zero 
mean and variance $\sigma^2$. Let $Z = (r[0], r[1], \ldots, r[N_s - 1])$; the 
pdf of $Z$ is [19] 

$$f(Z; \boldsymbol{\alpha}) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{N_s} \exp\left\{-\frac{1}{2\sigma^2} \sum_{m=0}^{N_s - 1} \left(r[m] - \mu[m]\right)^2\right\},$$

where $\mu[m] = b_0 \cos(\omega_0 (t_0 + m T_s) + \theta_0)$. In the following 
discussion, we omit the estimation of the signal 
amplitude $b_0$ since its estimation is independent of the 
estimation of frequency and phase. 

Fig. 1. Protocol for cooperative key generation with one relay. 

The $N_s$ samples in Eq. (5) are provided as the input to the 
ML estimator. According to the results in [19], the maximum 
of the function $f(Z; \boldsymbol{\alpha})$ is achieved when 

Thus, we can first estimate the frequency of the signal, and 
then calculate the ML estimate of the phase using Eq. 
Specifically, the MLE is implemented in three steps: 

1) Rough frequency search. We calculate the discrete 
Fourier transform (DFT) of $Z$ and find 
$\hat{k} = \arg\max_k |R[\omega_k]|$, where $\omega_k$ is the frequency of the $k$th DFT bin and $N$ is the length 
of the DFT. Here, $N$ is chosen to be a power of 2 and 
greater than $N_s$. The rough frequency estimate 
$\hat{\omega}_l$ is then the frequency of bin $\hat{k}$. This frequency estimate 
has a large estimation error due to the limited resolution of 
the DFT. Thus, a more accurate estimate is desired; 

2) Fine frequency search. Based on the rough estimate from 
the last step, we calculate $\hat{\omega}$ by maximizing the function 
$|R(\omega)|$, where $R(\omega)$ is the continuous DFT of the 
sample sequence $r[m]$, over an interval around $\hat{\omega}_l$. 
The fine search algorithm locates the value of $\omega$ closest 
to $\hat{\omega}_l$ that maximizes $|R(\omega)|$. The secant method is used 
to compute successive approximations to the frequency 
estimate $\hat{\omega} = \arg\max_\omega |R(\omega)|$; 

3) Phase estimation. The phase estimate is calculated 
by substituting $\hat{\omega}$ into the ML phase expression. 
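
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the exact estimator used in this paper: frequencies are in radians per sample ($T_s = 1$), the first sample is taken at $t_0 = 0$, the fine search uses a ternary search over the two DFT bins around the coarse peak instead of the secant method, and the phase is taken as the argument of the transform evaluated at the fine frequency estimate. The names `dtft` and `ml_single_tone` are hypothetical.

```python
import cmath
import math


def dtft(samples, omega):
    """Fourier transform of the sample sequence at angular frequency omega."""
    return sum(r * cmath.exp(-1j * omega * m) for m, r in enumerate(samples))


def ml_single_tone(samples, dft_len):
    """Estimate (frequency, phase) of a real single tone in white noise.

    dft_len plays the role of the DFT length N (a power of 2, > len(samples)).
    """
    step = 2.0 * math.pi / dft_len

    # Step 1: rough search over the positive-frequency DFT bins.
    k_hat = max(range(1, dft_len // 2),
                key=lambda k: abs(dtft(samples, k * step)))

    # Step 2: fine search within one bin on either side of the rough peak.
    # Ternary search is used here in place of the secant method.
    lo, hi = (k_hat - 1) * step, (k_hat + 1) * step
    for _ in range(60):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if abs(dtft(samples, m1)) < abs(dtft(samples, m2)):
            lo = m1
        else:
            hi = m2
    omega_hat = (lo + hi) / 2.0

    # Step 3: phase estimate from the transform at the fine frequency estimate.
    theta_hat = cmath.phase(dtft(samples, omega_hat)) % (2.0 * math.pi)
    return omega_hat, theta_hat
```

The ternary search relies on the magnitude of the transform having a single peak inside the search interval, which holds within the main lobe when the DFT length is at least twice the number of samples and the noise is moderate.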



The performance of MLE is measured by the variance of the 
estimation errors. This variance can be lower-bounded by the 
Cramer-Rao bound (CRB) [20]. The performance of the ML 
estimator, which is closely related to the performance of the 
proposed cooperative key generation scheme, will be discussed 
and analyzed later. We present the cooperative key generation 
protocol as follows (see Fig. 1): 
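
As a rough numerical illustration of how the CRB constrains phase estimation, the sketch below evaluates two standard textbook bounds for a real sinusoid in white Gaussian noise: the phase bound when the frequency is known, and the phase bound when amplitude, frequency and phase are all unknown. These are assumed here for illustration (they hold approximately when the tone is not close to zero frequency or half the sampling rate) and are not necessarily the expressions derived later in this paper; the function names are hypothetical.

```python
def snr_linear(amplitude, noise_std):
    """Per-sample SNR eta = b0^2 / (2 * sigma^2) of a real sinusoid."""
    return amplitude ** 2 / (2.0 * noise_std ** 2)


def crb_phase_known_frequency(num_samples, amplitude, noise_std):
    """Phase-variance bound when only the phase is unknown: 1 / (N * eta)."""
    return 1.0 / (num_samples * snr_linear(amplitude, noise_std))


def crb_phase_joint(num_samples, amplitude, noise_std):
    """Phase-variance bound with amplitude, frequency and phase unknown:
    2 * (2N - 1) / (eta * N * (N + 1)), phase referenced to the first sample."""
    n = num_samples
    return 2.0 * (2 * n - 1) / (snr_linear(amplitude, noise_std) * n * (n + 1))
```

For long observations the joint bound is close to four times the known-frequency bound, which indicates how much of the phase uncertainty comes from having to estimate the frequency as well.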

Phase One: Single-tone phase estimation and quantization. 

TS1: The protocol begins in timeslot 1 with transmission of a 
sinusoidal primary beacon of duration $T_1$ from node A: 

$$x_1(t) = a_1 \cos(\omega_c (t - t_1)),$$

where $t \in [t_1, t_1 + T_1)$. To simplify the exposition, we assume 
$t_1 = 0$ in the following discussion, i.e., the protocol starts at 
time zero. 

Node B ($R_1$) observes the initial transient response of the 
multipath channel $h_{A,B}(t)$ ($h_{A,R_1}(t)$) to the beacon $x_1(t)$ over 
the interval $t \in [\tau_{AB}, \tau_{AB} + \nu_{AB})$ ($t \in [\tau_{AR_1}, \tau_{AR_1} + \nu_{AR_1})$), 
where $\tau_{AB}$ ($\tau_{AR_1}$) denotes the delay of the shortest path and 
$\nu_{AB}$ ($\nu_{AR_1}$) denotes the finite delay spread of the channel 
$h_{A,B}(t)$ ($h_{A,R_1}(t)$). In order to achieve a steady-state response 
at both B and $R_1$, it is required that $T_1 > \max\{\nu_{AB}, \nu_{AR_1}\}$. 
The "steady-state" portion of the beacons received at B and 
$R_1$ can be written as 

At B: $r_{AB}(t) = a_1 \alpha_{AB} \cos(\omega_c t + \theta_{AB}) + n_{AB}(t)$, 
At $R_1$: $r_{AR_1}(t) = a_1 \alpha_{AR_1} \cos(\omega_c t + \theta_{AR_1}) + n_{AR_1}(t)$, 

where $t \in [\tau_{AB} + \nu_{AB}, \tau_{AB} + T_1)$ ($t \in [\tau_{AR_1} + \nu_{AR_1}, \tau_{AR_1} + T_1)$) 
for B ($R_1$), and $n_{AB}(t)$ ($n_{AR_1}(t)$) denotes the additive 
white Gaussian noise (AWGN) in the $A \to B$ ($A \to R_1$) 
channel. $\alpha_{AB}$ ($\alpha_{AR_1}$) and $\theta_{AB}$ ($\theta_{AR_1}$) are the steady-state 
gain and the phase response of channel $h_{A,B}(t)$ ($h_{A,R_1}(t)$), 
respectively. At the end of the primary beacon, a final transient 
response of the multipath channel is also received by B ($R_1$) 
over the interval $t \in [\tau_{AB} + T_1, \tau_{AB} + \nu_{AB} + T_1)$ 
($t \in [\tau_{AR_1} + T_1, \tau_{AR_1} + \nu_{AR_1} + T_1)$). B ($R_1$) uses only the steady-state 
portion of the noisy observation to compute ML estimates of 
the received frequency and phase, which are denoted by $\hat{\omega}_{AB}$ 
($\hat{\omega}_{AR_1}$) and $\hat{\theta}_{AB}$ ($\hat{\theta}_{AR_1}$), respectively. 



TS2: Upon the conclusion of the primary beacon $r_{AB}(t)$, in 
timeslot 2, B begins the transmission of a sinusoidal secondary 
beacon at $t_2 = \max\{\tau_{AB} + \nu_{AB} + T_1, \tau_{AR_1} + \nu_{AR_1} + T_1\}$. 
The secondary beacon transmitted by B at $t_2$ can be written 
as 

$$x_2(t) = a_2 \cos(\omega_c (t - t_2)),$$

where $t \in [t_2, t_2 + T_2)$. A ($R_1$) observes the initial transient 
response of the multipath channel $h_{B,A}(t)$ ($h_{B,R_1}(t)$) to 
beacon $x_2(t)$ over the interval $t \in [t_2 + \tau_{BA}, t_2 + \tau_{BA} + \nu_{BA})$ 
($t \in [t_2 + \tau_{BR_1}, t_2 + \tau_{BR_1} + \nu_{BR_1})$), where $\nu_{BA} = \nu_{AB}$ 
($\nu_{BR_1} = \nu_{R_1B}$) due to channel reciprocity. In order to 
achieve a steady-state response at both A and $R_1$, 
$T_2 > \max\{\nu_{BA}, \nu_{BR_1}\}$ is required. The steady-state portion of the 
beacons received at A and $R_1$ can be written as 

At A: $r_{BA}(t) = a_2 \alpha_{BA} \cos(\omega_c t + \theta_{BA}) + n_{BA}(t)$, 
At $R_1$: $r_{BR_1}(t) = a_2 \alpha_{BR_1} \cos(\omega_c t + \theta_{BR_1}) + n_{BR_1}(t)$, 

where $t \in [t_2 + \tau_{BA} + \nu_{BA}, t_2 + \tau_{BA} + T_2)$ 
($t \in [t_2 + \tau_{BR_1} + \nu_{BR_1}, t_2 + \tau_{BR_1} + T_2)$) for A ($R_1$), and $n_{BA}(t)$ ($n_{BR_1}(t)$) 
denotes the additive white Gaussian noise (AWGN) in the 
$B \to A$ ($B \to R_1$) channel. $\alpha_{BA}$ ($\alpha_{BR_1}$) and $\theta_{BA}$ ($\theta_{BR_1}$) 
are the steady-state gain and the phase response of channel 
$h_{B,A}(t)$ ($h_{B,R_1}(t)$), respectively. At the end of this beacon, 
a final transient response of the multipath channel is received 
by A ($R_1$) over the interval $t \in [t_2 + \tau_{BA} + T_2, t_2 + \tau_{BA} + T_2 + \nu_{BA})$ 
($t \in [t_2 + \tau_{BR_1} + T_2, t_2 + \tau_{BR_1} + T_2 + \nu_{BR_1})$). 

Similar to TS1, A ($R_1$) uses only the steady-state portion of 
the noisy observation to compute ML estimates of the received 
frequency and phase, which are denoted by $\hat{\omega}_{BA}$ ($\hat{\omega}_{BR_1}$) and 
$\hat{\theta}_{BA}$ ($\hat{\theta}_{BR_1}$), respectively. 

TS3: Upon the conclusion of the secondary beacon $r_{BR_1}(t)$, 
in timeslot 3, $R_1$ begins the transmission of a third sinusoidal 
beacon at $t_3 = \max\{t_2 + \tau_{BA} + \nu_{BA} + T_2, t_2 + \tau_{BR_1} + \nu_{BR_1} + T_2\}$. 
The third beacon transmitted by $R_1$ at $t_3$ 
can be written as 

$$x_3(t) = a_3 \cos(\omega_c (t - t_3)),$$

where $t \in [t_3, t_3 + T_3)$. A (B) observes the initial transient 
response of the multipath channel $h_{R_1,A}(t)$ ($h_{R_1,B}(t)$) to 
beacon $x_3(t)$ over the interval $t \in [t_3 + \tau_{R_1A}, t_3 + \tau_{R_1A} + \nu_{R_1A})$ 
($t \in [t_3 + \tau_{R_1B}, t_3 + \tau_{R_1B} + \nu_{R_1B})$), where $\nu_{R_1A} = \nu_{AR_1}$ 
($\nu_{R_1B} = \nu_{BR_1}$) due to channel reciprocity. In order to 
achieve a steady-state response at both A and B, 
$T_3 > \max\{\nu_{R_1A}, \nu_{R_1B}\}$ is required. The steady-state portion of the 
beacons received at A and B can be written as 

At A: $r_{R_1A}(t) = a_3 \alpha_{R_1A} \cos(\omega_c t + \theta_{R_1A}) + n_{R_1A}(t)$, 
At B: $r_{R_1B}(t) = a_3 \alpha_{R_1B} \cos(\omega_c t + \theta_{R_1B}) + n_{R_1B}(t)$, 

where $t \in [t_3 + \tau_{R_1A} + \nu_{R_1A}, t_3 + \tau_{R_1A} + T_3)$ 
($t \in [t_3 + \tau_{R_1B} + \nu_{R_1B}, t_3 + \tau_{R_1B} + T_3)$) for A (B), and $n_{R_1A}(t)$ ($n_{R_1B}(t)$) 
denotes the additive white Gaussian noise (AWGN) in the 
$R_1 \to A$ ($R_1 \to B$) channel. $\alpha_{R_1A}$ ($\alpha_{R_1B}$) and $\theta_{R_1A}$ ($\theta_{R_1B}$) 
are the steady-state gain and the phase response of channel 
$h_{R_1,A}(t)$ ($h_{R_1,B}(t)$), respectively. At the end of this beacon, 
a final transient response of the multipath channel is received 
by A (B) over the interval $t \in [t_3 + \tau_{R_1A} + T_3, t_3 + \tau_{R_1A} + T_3 + \nu_{R_1A})$ 
($t \in [t_3 + \tau_{R_1B} + T_3, t_3 + \tau_{R_1B} + T_3 + \nu_{R_1B})$). 

Similar to TS2, A (B) uses only the steady-state portion of 
the noisy observation to compute ML estimates of the received 
frequency and phase, which are denoted by $\hat{\omega}_{R_1A}$ ($\hat{\omega}_{R_1B}$) and 
$\hat{\theta}_{R_1A}$ ($\hat{\theta}_{R_1B}$), respectively. 
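
The timing of the three timeslots can be sketched with a small helper that computes the beacon start times and the steady-state observation windows from the link delays, delay spreads and beacon durations. The function names, the dictionary keys (`"AB"`, `"AR"`, `"BR"` for the reciprocal links A-B, A-$R_1$ and B-$R_1$) and the numeric values in the usage note are illustrative assumptions.

```python
def beacon_schedule(tau, nu, durations):
    """Return the start times (t1, t2, t3) of the three beacons.

    tau, nu: dicts mapping link names "AB", "AR", "BR" to the shortest-path
    delay and the delay spread of that (reciprocal) link, in seconds.
    durations: tuple (T1, T2, T3) of beacon durations in seconds.
    """
    t_dur1, t_dur2, _ = durations
    t1 = 0.0
    # B starts once the final transients of A's beacon have ended at B and R1.
    t2 = max(tau["AB"] + nu["AB"] + t_dur1, tau["AR"] + nu["AR"] + t_dur1)
    # R1 starts once the final transients of B's beacon have ended at A and R1.
    t3 = max(t2 + tau["AB"] + nu["AB"] + t_dur2,
             t2 + tau["BR"] + nu["BR"] + t_dur2)
    return t1, t2, t3


def steady_state_window(start, tau_link, nu_link, duration):
    """Interval [start + tau + nu, start + tau + T) used for ML estimation."""
    if duration <= nu_link:
        raise ValueError("beacon duration must exceed the delay spread")
    return start + tau_link + nu_link, start + tau_link + duration
```

For example, with delays of 1, 2 and 1.5 microseconds, delay spreads of 0.5, 0.3 and 0.4 microseconds on links AB, AR and BR, and 1 ms beacons, the second and third beacons start at roughly 1.0023 ms and 2.0042 ms, respectively.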

Quantization. To generate high-entropy bits, we assume A, 
B and $R_1$ run the above steps once during each coherence 
time interval. For ease of exposition, we term the above steps 
round 1. After round 1, each of the three nodes has two 
phase estimates for quantization: 

A: $\hat{\theta}_{BA} \bmod 2\pi$, $\hat{\theta}_{R_1A} \bmod 2\pi$ 
B: $\hat{\theta}_{AB} \bmod 2\pi$, $\hat{\theta}_{R_1B} \bmod 2\pi$ 
$R_1$: $\hat{\theta}_{AR_1} \bmod 2\pi$, $\hat{\theta}_{BR_1} \bmod 2\pi$ 

Each node uniformly maps its phase estimates into a 
quantization interval/index using the following formula: 

$$Q(x) = k \quad \text{if } x \in \left[\frac{2\pi(k-1)}{q}, \frac{2\pi k}{q}\right)$$

for $k = 1, 2, \ldots, q$. Therefore, in the first round, the quantization
of each phase value generates $\log_2(q)$ secret bits. 
Due to the channel reciprocity principle, A and B share $\log_2(q)$ 
bits generated from $\hat{\theta}_{BA}$ ($\hat{\theta}_{AB}$); A and $R_1$ share $\log_2(q)$ bits 
generated from $\hat{\theta}_{R_1A}$ ($\hat{\theta}_{AR_1}$); B and $R_1$ share $\log_2(q)$ bits 
generated from $\hat{\theta}_{R_1B}$ ($\hat{\theta}_{BR_1}$). Note that the quantization index $k$ is 
encoded into bit vectors. In our implementation, we use Gray 
codes to reduce the bit error probability (BER). 
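
The quantization and Gray encoding described above can be sketched as follows. The sketch assumes that $q$ is a power of two and uses the standard reflected binary Gray code; the function names are hypothetical.

```python
import math


def quantize_phase(theta, q):
    """Return k in 1..q with (theta mod 2*pi) in [2*pi*(k-1)/q, 2*pi*k/q)."""
    x = theta % (2.0 * math.pi)
    k = int(x / (2.0 * math.pi / q)) + 1
    return min(k, q)  # guard against rounding at the upper edge


def gray_bits(k, q):
    """Encode quantization index k (1-based) as a log2(q)-bit Gray codeword."""
    if q < 2 or q & (q - 1):
        raise ValueError("q must be a power of two")
    num_bits = q.bit_length() - 1
    index = k - 1
    gray = index ^ (index >> 1)  # reflected binary Gray code
    return format(gray, f"0{num_bits}b")


def phase_to_bits(theta, q):
    """Map a phase estimate to its log2(q) key bits."""
    return gray_bits(quantize_phase(theta, q), q)
```

With a Gray code, adjacent quantization intervals differ in exactly one bit, so a small phase estimation error that pushes the estimates at the two nodes into neighboring intervals causes a single bit disagreement rather than several.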

Assume the desired key size is $|K|$. For round $k = 2, 3, \ldots, \frac{|K|}{2\log_2(q)}$, $A$, $B$ and $R_1$ repeat the operations as in TS1, TS2 and TS3 to generate phase estimates and convert them into bit vectors through $q$-level quantization.

After $\frac{|K|}{2\log_2(q)}$ rounds, a key of size $\frac{|K|}{2}$ is shared between $A$ and $B$, which is denoted as $K_1$. Similarly, a key of size $\frac{|K|}{2}$ is shared between $A$ and $R_1$, which is denoted as $K_2$; a key of size $\frac{|K|}{2}$ is shared between $B$ and $R_1$, which is denoted as $K_3$. Then $R_1$ computes $K_2 \oplus K_3$ and transmits it over the public channel. $A$ receives the XOR information and computes $K_2 \oplus (K_2 \oplus K_3) = K_3$. Similarly, $B$ obtains $K_2$ by $K_3 \oplus (K_2 \oplus K_3) = K_2$. Now both $A$ and $B$ have keys $K_1$, $K_2$ and $K_3$.

Finally, $A$ and $B$ set the final key as $K_1 \| K_2$ or $K_1 \| K_3$, and a secret key of size $|K|$ is established. Note that we use either $K_2$ or $K_3$, instead of both, as the component of the final key. The reason is that $K_2 \oplus K_3$ is public: given either one of $K_2$ and $K_3$, the eavesdropper can recover the other, so the pair contributes only $\frac{|K|}{2}$ secret bits.
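The relay-assisted exchange above can be sketched in a few lines. The keys here are random stand-ins for the quantized phase bits, and the function names are illustrative rather than part of the protocol specification.

```python
import secrets


def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def relay_assisted_key(k1, k2, k3):
    """Simulate the single-relay exchange.

    k1: shared by A and B, k2: shared by A and R1, k3: shared by B and R1.
    Returns (final key at A, final key at B, public broadcast of R1).
    """
    broadcast = xor_bytes(k2, k3)        # R1 sends K2 xor K3 in the clear
    k3_at_a = xor_bytes(k2, broadcast)   # A recovers K3
    k2_at_b = xor_bytes(k3, broadcast)   # B recovers K2
    assert k3_at_a == k3 and k2_at_b == k2
    # Both sides agree to use K1 || K2 (K1 || K3 would work equally well).
    return k1 + k2, k1 + k2_at_b, broadcast


if __name__ == "__main__":
    k1, k2, k3 = (secrets.token_bytes(8) for _ in range(3))
    key_a, key_b, public = relay_assisted_key(k1, k2, k3)
    print(key_a == key_b, len(key_a))
```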

Phase Two: Key reconciliation and privacy amplification.

By the reciprocity principle, the bit sequences generated at $A$ and $B$ should be identical. However, there may exist a small number of bit discrepancies due to estimation errors, hardware variations and half-duplex beacon transmission. These error bits can be corrected using key reconciliation techniques [17], [?]. Assume $A$ and $B$ hold $K$ and $K'$, respectively, and that the Hamming distance satisfies $\mathrm{dis}(K, K') \le t$. Following the code-offset construction in [?], we use an $[n, k, 2t+1]_2$ error-correcting code $C$ to correct errors in $K'$ even though $K'$ may not be in $C$. When performing key reconciliation, node $A$ randomly selects a codeword $c$ from $C$ and computes the secure sketch $SS(K) = s = K \oplus c$. Then $s$ is sent to node $B$. Upon receiving $s$, node $B$ subtracts the shift $s$ from $K'$ and gets $Rec(K', s)$: $c' = K' \oplus s$. Then node $B$ decodes $c'$ to get $c$, and computes $K$ by shifting back to get $K = c \oplus s$. Note that since the error-correcting information $s$ is public to both the communicating nodes and the adversary, it can be used by the adversary to guess portions of the generated key [?]. To cope with this problem, $A$ and $B$ can further run privacy amplification protocols [?] to recover the entropy loss.
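The code-offset reconciliation described above can be illustrated with a toy example. This sketch swaps in a blockwise length-3 repetition code (which corrects one error per 3-bit block) in place of a general $[n, k, 2t+1]_2$ code; the paper does not state which code it uses, and all function names are hypothetical.

```python
import random

REP = 3  # repetition factor: each message bit is sent 3 times, correcting 1 error per block


def encode(msg_bits):
    """Repetition-encode a list of bits."""
    return [b for b in msg_bits for _ in range(REP)]


def decode(code_bits):
    """Majority-decode each block of REP bits."""
    return [1 if sum(code_bits[i:i + REP]) * 2 > REP else 0
            for i in range(0, len(code_bits), REP)]


def sketch(key_bits, rng):
    """Node A: pick a random codeword c and publish s = K xor c."""
    if len(key_bits) % REP != 0:
        raise ValueError("key length must be a multiple of REP")
    c = encode([rng.randrange(2) for _ in range(len(key_bits) // REP)])
    return [k ^ x for k, x in zip(key_bits, c)]


def recover(noisy_key_bits, s):
    """Node B: c' = K' xor s, decode c' to c, then K = c xor s."""
    c_noisy = [k ^ x for k, x in zip(noisy_key_bits, s)]
    c = encode(decode(c_noisy))
    return [x ^ y for x, y in zip(c, s)]


if __name__ == "__main__":
    rng = random.Random(7)
    key_a = [1, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 1]
    key_b = list(key_a)
    key_b[0] ^= 1   # one error in block 0
    key_b[7] ^= 1   # one error in block 2
    s = sketch(key_a, rng)
    print(recover(key_b, s) == key_a)
```

A practical deployment would use a stronger code (for example BCH) and follow reconciliation with privacy amplification, since the sketch $s$ leaks information about $K$.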

B. Exploiting Multiple Relays 

In this subsection, we present the key generation protocol with multiple relay nodes. As discussed above, when there exists only one relay $R_1$, it can contribute $\log_2 q$ bits in each coherence time interval. Since the beacon duration (observation time) $T_i$ is relatively small compared to the coherence time, a large portion of the coherence time interval cannot be effectively utilized. This motivates us to incorporate more relays into the key generation process, with two potential advantages: i) the key rate is further increased due to the contribution of multiple relays during the same coherence time interval. This also implies that even if the nodes or the environment remain static, a key with high entropy can be generated quickly, since the protocol employs the randomness of multiple different pairwise links; ii) the security strength is further enhanced, as each relay only contributes a small portion of secret bits to the final key. That is, even if a small number of relays are compromised, the adversary cannot obtain the complete global key bit information.

With the aid of $N$ relay nodes, the protocol has a total of $N + 2$ timeslots for each round (during one coherence time interval $T_c$). Assume the coherence time is roughly divided into $N + 2$ portions, each with length $\frac{T_c}{N+2}$. The activities in each timeslot of round 1 are as follows (for ease of exposition, we ignore the explicit value of $t_i$ for $i = 1, 2, \ldots, N + 2$):

1) In TS$_1$, node $A$ transmits a sinusoidal primary beacon $x_1(t)$. Node $B$ ($R_j$, $j = \{1, 2, \ldots, N\}$) neglects the initial and final transient portions of the received signal and uses the steady portion to compute the channel phase estimates $\hat{\theta}_{AB}$ ($\hat{\theta}_{AR_j}$).

2) In TS$_2$, node $B$ transmits a sinusoidal secondary beacon $x_2(t)$. Node $A$ ($R_j$, $j = \{1, 2, \ldots, N\}$) neglects the initial and final transient portions of the received signal and uses the steady portion to compute the channel phase estimates $\hat{\theta}_{BA}$ ($\hat{\theta}_{BR_j}$).

3) In TS$_i$ ($i = \{3, 4, \ldots, N + 2\}$), node $R_j$ ($j = \{1, 2, \ldots, N\}$) alternately transmits a sinusoidal beacon $x_i(t)$. Nodes $A$ and $B$ neglect the initial and final transient portions of the received signal and use the steady portion to compute the channel phase estimates $\hat{\theta}_{R_jA}$ ($\hat{\theta}_{R_jB}$) for $j = \{1, 2, \ldots, N\}$.

Assume the desired key size is $|K|$. For round $k = 2, 3, \ldots, \frac{|K|}{(N+1)\log_2(q)}$, $A$, $B$ and the relays $R_j$ repeat the operations as in TS$_1$, TS$_2$, $\ldots$, TS$_{N+2}$ to generate phase estimates and convert them into bit vectors through $q$-level quantization.

After $\frac{|K|}{(N+1)\log_2(q)}$ rounds, a key of size $\frac{|K|}{N+1}$ is shared between $A$ and $B$, which is denoted as $K_1$. Similarly, a key of size $\frac{|K|}{N+1}$ is shared between $A$ and $R_j$ ($j = \{1, 2, \ldots, N\}$), which is denoted as $K_{j1}$; a key of size $\frac{|K|}{N+1}$ is shared between $B$ and $R_j$ ($j = \{1, 2, \ldots, N\}$), which is denoted as $K_{j2}$. Then $R_j$ computes $K_{j1} \oplus K_{j2}$ and transmits it over the public channel. $A$ receives the XOR information and computes $K_{j1} \oplus (K_{j1} \oplus K_{j2}) = K_{j2}$. Similarly, $B$ obtains $K_{j1}$ by $K_{j2} \oplus (K_{j1} \oplus K_{j2}) = K_{j1}$. Now both $A$ and $B$ have $2N + 1$ keys $K_1$, $K_{j1}$ and $K_{j2}$ for $j = \{1, 2, \ldots, N\}$.

Finally, $A$ and $B$ set the final key as
$K_1 \| (K_{11} \text{ or } K_{12}) \| (K_{21} \text{ or } K_{22}) \| \cdots \| (K_{N1} \text{ or } K_{N2})$.
The key reconciliation and privacy amplification phase is the same as in the single-relay case. Note that since a single coherence time interval is evenly allocated to the keying nodes and relay nodes, an increase of $N$ results in a decrease of the available observation time $T_o$ (beacon duration $T_i$). As will be shown later, this leads to an increase of the estimation errors in MLE. Therefore, there must exist an optimal $N$ at which the key rate is maximized.
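The bookkeeping of the multi-relay protocol can be summarized in a small helper. This is a sketch under the even-allocation assumption stated above (each of the $N + 2$ timeslots gets $T_c/(N+2)$) and ignores propagation delay, delay spread and processing time; the function and field names are illustrative.

```python
import math


def relay_schedule(t_c, n_relays, key_bits, q):
    """Per-round timing and round count for N relays.

    t_c: coherence time in seconds, n_relays: N, key_bits: |K|,
    q: number of quantization levels (assumed to be a power of two).
    """
    slots = n_relays + 2                       # TS_1 .. TS_{N+2}
    t_o = t_c / slots                          # observation time per beacon
    bits_per_round = (n_relays + 1) * math.log2(q)
    rounds = math.ceil(key_bits / bits_per_round)
    return {
        "slots": slots,
        "observation_time": t_o,
        "bits_per_round": bits_per_round,
        "rounds": rounds,
        "keys_held_by_A_and_B": 2 * n_relays + 1,
    }


if __name__ == "__main__":
    print(relay_schedule(14e-3, 2, 128, 16))
```

The helper makes the trade-off visible: adding relays raises the bits harvested per round but shortens the observation time available to each beacon.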

V. Theoretical Performance Analysis 

In this section, we analyze the performance of the coop- 
erative key generation protocol in terms of the maximum 
key rate the system can achieve. In information theory, the 
mutual information of two random variables/sequences is a 
quantity that measures the mutual dependence of the two 
variables/sequences. Therefore, the secret key rate can be 
upper bounded by the mutual information between the obser- 
vations of two transceivers. Motivated by this, we first provide 
an information-theoretic study into the upper bound on the 
key rate using mutual information. This bound denotes the 
maximum key rate that can be generated from the common 
randomness between the keying nodes. In estimation theory, 
Cramer-Rao bound provides a lower bound on the variance of 
biased and unbiased estimators of a deterministic parameter. 
Since we utilize maximum likelihood estimation (MLE) in our 
proposed key generation protocol, we also propose to derive 
a tighter bound on the key rate using the Cramer-Rao bound 
(CRB). 

A. Knowing the Limit: The Upper Bound on Key Rate from 
Mutual Information 

In this subsection, we analyze the mutual information between the observations of two nodes $i$ and $j$ at the two ends of a multipath fading channel. We start the analysis from the no-relay case. As shown above, all the received signals can be expressed as in Eq. (4). These single-tone signals can be precisely reconstructed from samples taken at a sampling rate greater than or equal to the Nyquist rate $f_s = 1/T_s = 2f_c$ (note that in the following analysis, we choose $f_s \ge 2f_c$). The discrete-time observations at nodes $i$ and $j$ are

$$r_{ij}[m] = a\alpha_{ij}\cos(\omega_c(t_{ij} + mT_s) + \theta_{ij}) + n_{ij}[m] \qquad (7)$$
$$r_{ji}[m] = a\alpha_{ji}\cos(\omega_c(t_{ji} + mT_s) + \theta_{ji}) + n_{ji}[m] \qquad (8)$$

for $m = 0, 1, \ldots, N_s - 1$, where $t_{ij}$ ($t_{ji}$) denotes the time of the first sample. Note that when there is no relay, nodes $A$ and $B$ can each generate $N_s$ samples by fully exploiting the coherence time interval. That is, if we neglect the transmission delay, delay spread and processing delay, the observation time (i.e., beacon duration) is $T_o \approx T_c/2$. Thus, $N_s = T_o f_s = T_c f_s / 2$.

Let $\mathbf{R}_{ij} = [r_{ij}[0], r_{ij}[1], \ldots, r_{ij}[N_s - 1]]$ and $\mathbf{R}_{ji} = [r_{ji}[0], r_{ji}[1], \ldots, r_{ji}[N_s - 1]]$ denote the samples obtained at nodes $j$ and $i$, respectively. According to [?], $I(r_{ij}(t); r_{ji}(t)) = I(\mathbf{R}_{ij}; \mathbf{R}_{ji})$, as $r(t)$ is fully defined by $\mathbf{R}$.

In practice, given a set $X$ of independent identically distributed data conditioned on an unknown parameter $\theta$, a sufficient statistic is a function $T(X)$ whose value contains all the information needed to compute any estimate of the parameter (e.g., a maximum likelihood estimate (MLE)). For ease of exposition, we restate the received signal here:

$$r(t) = |h|\cos\theta\cos 2\pi f_c t - |h|\sin\theta\sin 2\pi f_c t + n(t) = |h|\cos(2\pi f_c t + \theta) + n(t).$$

In MLE, $|h|\cos(2\pi f_c t + \theta) + n(t)$ is sampled to estimate $|h|$ and $\theta$, where the complex expression of the multipath channel is $h = |h|e^{j\theta}$. Once $|h|$ and $\theta$ are obtained, the terms $|h|\cos\theta\cos 2\pi f_c t$ and $|h|\sin\theta\sin 2\pi f_c t$ are both determined. So it is equivalent to sample and estimate a signal like $|h|\cos\theta\cos 2\pi f_c t$ or $|h|\sin\theta\sin 2\pi f_c t$ to fully determine the fading channel information. The "equivalent" received signals at nodes $i$ and $j$ can be written as



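$$\tilde{r}_{ij}[m] = a\alpha_{ij}\cos\theta\cos(\omega_c(t_{ij} + mT_s)) + n_{ij}[m]$$
$$\tilde{r}_{ji}[m] = a\alpha_{ji}\cos\theta\cos(\omega_c(t_{ji} + mT_s)) + n_{ji}[m]$$

for $m = 0, 1, \ldots, N_s - 1$. Because $\tilde{r}[m]$ is fully defined by $r[m]$ and vice versa, the mutual information between $\tilde{r}_{ij}[m]$ and $\tilde{r}_{ji}[m]$ is the same as that between $r_{ij}[m]$ and $r_{ji}[m]$, i.e., $I(\tilde{\mathbf{R}}_{ij}; \tilde{\mathbf{R}}_{ji}) = I(\mathbf{R}_{ij}; \mathbf{R}_{ji})$, where $\tilde{\mathbf{R}}_{ij}$ and $\tilde{\mathbf{R}}_{ji}$ are the discrete-time sequences of $\tilde{r}_{ij}[m]$ and $\tilde{r}_{ji}[m]$, respectively.

Now the problem becomes a Gaussian random variable estimation problem, where the in-phase component $r_I(t) = |h|\cos\theta = \alpha\cos\theta$ is to be estimated (in the following, we abuse standard notation by letting $h$ denote the in-phase component). Let $\mathbf{S}_i = \mathbf{S}_j = [a\cos(\omega_c(0T_s)), a\cos(\omega_c(1T_s)), \ldots, a\cos(\omega_c((N_s-1)T_s))]$. Both nodes $i$ and $j$ can compute a sufficient statistic $\hat{R}_{ji}$ and $\hat{R}_{ij}$ for $\mathbf{R}_{ji}$ and $\mathbf{R}_{ij}$, respectively [22].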
Theorem 1: Let $h \sim \mathcal{N}(0, \sigma_h^2)$ and $N_i, N_j \sim \mathcal{N}(0, \sigma_n^2)$. Based on the sufficient statistics $(\hat{R}_{ji}, \hat{R}_{ij})$ at the two ends, nodes $i$ and $j$ can generate secret key bits at rate

$$R^{MI} = \frac{\ln 2}{2T_c}\log_2\left(1 + \frac{\sigma_h^4 N_s^2 P^2}{\sigma_n^4 + 2\sigma_n^2\sigma_h^2 N_s P}\right),$$

where $P$ denotes the transmission power, $N_s$ denotes the number of samples and $T_c$ is the coherence time.

Proof: See Appendix A. ■



In the above discussion, we focus on two nodes $i$ and $j$ with no relay node. We next analyze the key rate when there are $N$ relay nodes. If the sampling rate $f_s$ is fixed, the coherence time $T_c$, which contains $2N_s$ samples, is divided into $N + 2$ shares. From the point of view of nodes $A$ and $B$, each of them "sends" $\frac{2N_s}{N+2}$ samples. Thus, the cooperative key generation rate is

$$R_{co}^{MI} = \frac{(N+1)\ln 2}{2T_c}\log_2\left(1 + \frac{\sigma_h^4\left(\frac{2N_s}{N+2}\right)^2 P^2}{\sigma_n^4 + 2\sigma_n^2\sigma_h^2\left(\frac{2N_s}{N+2}\right)P}\right). \qquad (11)$$

Although the mutual information between each node pair decreases due to the reduction in the number of samples, the relay nodes help $A$ and $B$ to establish more key components. This gain becomes more significant when the SNR increases or the channel changes very slowly. We have the following theorem.

Theorem 2: When there are N relay nodes, the cooperative 
gain is 
(12)



(13) 



where Rf^ 

As we can see, the gain of cooperative key generation is similar to the beamforming gain in cooperative networking, which is linear in the number of relay nodes.

B. A More Practical Bound: The Upper Bound on Key Rate 
from Cramer-Rao bound (CRB) 

In the last subsection, we derive a theoretical upper bound 
on key rate from mutual information. This bound serves as a 
universal bound in the sense that it does not depend on the 
specific method of estimation, and it is not tight in general. 
Therefore, we next compute a more practical and tighter bound 
on key rate from Cramer-Rao bound (CRB) in estimation 
theory. 

In the existing RSS-based key generation methods, the signal envelopes are sampled and quantized for the calculation of secret bits. When the signal envelope is used, there exists a trade-off between reducing the sensitivity of the system to timing error and losing variability in the resulting key [12]. In contrast, in this paper we use the uniformly distributed channel phase for key generation to achieve a high level of entropy. One of the most important properties of maximum likelihood estimators (MLE) is that they attain the Cramer-Rao bound at least asymptotically. As before, starting from the no-relay case, we have the following theorem:

Theorem 3: When maximum likelihood estimation (MLE) and uniform quantization are used, the expected key rate is upper-bounded by

$$R^{CRB} = \frac{\bar{P}_{QIA}\log_2 q}{T_c},$$

where $\bar{P}_{QIA}$ is the average probability of quantization index agreement.

Proof: See Appendix B. ■




Fig. 2. Key rate versus observation time To under different SNRs. 



When there are $N$ relay nodes, the number of samples at each node is $N_s' = \frac{2N_s}{N+2}$. We substitute $N_s'$ for $N_s$ in Eq. (24) and obtain the new CRB for $\theta$. This bound is used to calculate $\bar{P}_{QIA}^{co}$. Thus, the expected key rate for cooperative key generation becomes

$$R_{co}^{CRB} = \frac{(N+1)\bar{P}_{QIA}^{co}\log_2 q}{T_c}. \qquad (14)$$

It is easy to see that as $q$ increases, nodes $i$ and $j$ could generate a longer bit vector during the same coherence time $T_c$. However, due to estimation errors, the probability of generating the same bit vector decreases. The maximum key agreement rate is attained when $q$ satisfies

$$\frac{\partial R^{CRB}}{\partial q} = 0. \qquad (15)$$

From the above discussion, we conclude that there exists an optimal $q$ at which the maximum key rate can be achieved. We demonstrate how the key rate changes as a function of $q$ through simulations in Section VI.
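The existence of an optimal $q$ can be checked numerically with a short sketch. It assumes that the two nodes' phase errors are independent zero-mean Gaussians with a common standard deviation `sigma` (a free parameter here, not a value from the paper), computes the agreement probability as the sum over sectors of the squared per-sector probability, averages it over true phases within one sector, and evaluates $\bar{P}_{QIA}\log_2 q / T_c$. Sector indices run from 0 to $q-1$, and all function names are illustrative.

```python
import math


def gauss_cdf(x, sigma):
    """CDF of a zero-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))


def sector_prob(theta, k, q, sigma):
    """P((theta + error) mod 2*pi falls in sector k), error ~ N(0, sigma^2).

    Wrap-around is handled for one full turn in each direction, which is
    sufficient when sigma is much smaller than 2*pi.
    """
    width = 2.0 * math.pi / q
    total = 0.0
    for wrap in (-1, 0, 1):
        lo = k * width + wrap * 2.0 * math.pi - theta
        total += gauss_cdf(lo + width, sigma) - gauss_cdf(lo, sigma)
    return total


def avg_agreement(q, sigma, grid=32):
    """Average probability that both nodes pick the same sector, with the
    true phase uniform over one sector (by symmetry, sector 0)."""
    width = 2.0 * math.pi / q
    acc = 0.0
    for g in range(grid):
        theta = (g + 0.5) * width / grid
        acc += sum(sector_prob(theta, k, q, sigma) ** 2 for k in range(q))
    return acc / grid


def key_rate(q, sigma, t_c):
    """Expected key rate in bits per second for q levels."""
    return avg_agreement(q, sigma) * math.log2(q) / t_c


def best_q(sigma, t_c, candidates):
    """Candidate q with the highest expected key rate."""
    return max(candidates, key=lambda q: key_rate(q, sigma, t_c))


if __name__ == "__main__":
    levels = [2 ** n for n in range(1, 11)]
    print(best_q(0.01, 14e-3, levels))
```

For a fixed `sigma`, the rate first grows with $\log_2 q$ and then falls once the sector width becomes comparable to the estimation error, so the maximizer lies strictly inside the candidate range.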

C. Numerical Illustration on Theoretical Upper Bounds 

Assume the coherence time is $T_c = 14$ ms. The example in Fig. 2 presents the two upper bounds on the key rate between two nodes (i.e., no relay) as the observation time $T_o$ increases. The results show that the upper bound derived from mutual information serves as the universal upper bound on the key rate. As expected, with a fixed number of quantization levels, an increase of SNR or $T_o$ leads to an increase of the key rate. Since there are only two nodes, the observation time for each node can be up to 7 ms. When $T_o$ changes from 0 to 2.4 ms, the key rate increases rapidly, and it increases almost linearly as a function of $T_o$ after 2.4 ms. Hence, a shorter observation time can be chosen while still maintaining an acceptable level of key rate. On the other hand, while the maximum $T_o$ is constrained by $T_c/2$, one can further enhance the key rate by increasing the SNR.

Fig. 3 plots the upper bounds on the key rate when the number of relays $N$ increases. The close match of the bound from mutual information and the bound from CRB before $N = 500$ shows that the CRB can be used to efficiently approach the universal upper bound when the nodes use ML phase estimation. Recall that as $N$ increases, the observation time $T_o$ for each node decreases, because the whole coherence time is equally divided among the keying nodes and relay nodes. Because the decrease of $T_o$ causes more estimation errors, there exists a threshold on the key rate. This can be clearly observed from the results: the bound based on CRB gradually reaches its maximum and decreases after $N = 2500$. For the sake of clearly illustrating the inflection point on the bound curve from CRB, we limit the range of $N$ in the figure. In fact, there also exists an inflection point on the bound curve from mutual information when $N$ goes to infinity.

Fig. 3. Key rate versus the number of relays $N$. Note that the observation time $T_o$ is not fixed, i.e., $T_o$ decreases as $N$ increases.

TABLE II
Simulation Configuration

Parameter                   Value
Carrier frequency $f_c$     900 MHz
Sampling frequency $f_s$    2.7 GHz
Average moving speed $v$    10 m/s
Coherence time $T_c$        14 ms
Node distance $d$           2 m - 10 m
Delay spread $\nu$          1.1 μs

Discussion. In our protocol, the keying nodes rely on a common time reference to generate absolute phase estimates. If there exists no common time reference among the nodes, each node has to rely on its own local time obtained from its local oscillator. This implies that the phase estimate generated by each node will have an "unknown" offset associated with the node itself, which prevents the key generation protocol from working correctly. As a future direction, it is worthwhile to extend our protocol to overcome the effect of unknown phase offsets and allow key generation in the unsynchronized case.

We are also going to build a simple prototype to validate the effectiveness of the protocol. The nodes can be implemented with TMS320C6713 DSK boards, and the primary beacons can be generated and sent out by a function generator, e.g., an HP33120A. In the implementation, we can use phase-locked loops (PLLs) to realize the phase and frequency estimation functions and improve efficiency. Since each node transmits a periodic extension of a beacon received in a previous timeslot, the phase and frequency estimation functions during the synchronization timeslots can be realized by using PLLs with holdover circuits, i.e., the PLLs are required for each node to store its local phase and frequency estimates during protocol execution.

Fig. 4. Key rate versus the number of quantization levels $q$.

Fig. 5. Bit error probability $p_e$ versus the number of quantization levels $q$.

VI. Simulation Studies 
A. Key Rate and Bit Error Probability 

This section presents simulation results of the cooperative key generation protocol in multipath fading channels. In our simulation, we sample the beacon signal with sampling rate $f_s = 3f_c$, where $f_c = 900$ MHz is the carrier frequency of the single-tone signal. In a mobile scenario, we assume the moving speed is $v = 10$ m/s. Thus, the Doppler frequency shift is $f_d = v/\lambda = 30$ Hz, which results in a coherence time of $T_c \approx 0.423/f_d = 14$ ms. Assume $\nu$ is the delay spread, with a typical value of $1.2 \times 10^{-6}$ s, and the distance $d$ between nodes changes from 2 m to 10 m. Thus, the random propagation delay is $\tau = d/c$, ranging from 6.67 ns to 33.3 ns. We choose $T_o$ much larger than the delay spread $\nu$ so that the steady-state response can be achieved. The simulation settings are summarized in Table II. Two different methods are used here to estimate the variance of the phase estimation error: (i) full ML estimation and (ii) approximate analytical predictions using the CRB.
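The simulation parameters above follow from a few lines of arithmetic, sketched below. The coherence-time rule $T_c \approx 0.423/f_d$ is a common rule of thumb and is an assumption of this sketch; it reproduces the 14 ms value quoted in the text. The function names are illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s


def doppler_shift(speed, carrier_hz):
    """Maximum Doppler shift f_d = v / lambda = v * f_c / c, in Hz."""
    return speed * carrier_hz / SPEED_OF_LIGHT


def coherence_time(f_d):
    """Rule-of-thumb coherence time T_c ~ 0.423 / f_d, in seconds
    (assumed here, not stated explicitly in the text)."""
    return 0.423 / f_d


def propagation_delay(distance_m):
    """Line-of-sight propagation delay tau = d / c, in seconds."""
    return distance_m / SPEED_OF_LIGHT


if __name__ == "__main__":
    f_d = doppler_shift(10.0, 900e6)
    print("Doppler shift [Hz]:", f_d)
    print("Coherence time [ms]:", coherence_time(f_d) * 1e3)
    print("Delay at 2 m [ns]:", propagation_delay(2.0) * 1e9)
    print("Delay at 10 m [ns]:", propagation_delay(10.0) * 1e9)
```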

The first example considers the effect of the quantization level $q$ on the key rate. Fig. 4 plots the key rate versus $q$ given SNR = 25 dB and $T_o = 7.5$ μs, using both the CRB analytical predictions and simulations. The results show two regimes of operation. In the small-quantization-level regime, the effect of $\log_2 q$ dominates the key rate. In this regime, the probability $\bar{P}_{QIA}$ that two estimates fall into the same interval is very high. Thus, an increase of $q$ leads to an increase of the key rate. According to Eq. (15), when $q$ begins to exceed a threshold, the key rate begins to decrease and enters the large-quantization-level regime. In this regime, the key rate decreases quickly as $q$ further increases. This is because the estimation errors dominate the performance as the length of each interval, $2\pi/q$, decreases, i.e., $\bar{P}_{QIA}$ is very sensitive to the estimation errors when the interval is short. As might be expected, the CRB can be used to efficiently predict the performance when $q$ is relatively small, e.g., $q < 10^{?}$ in this setting. Since the CRB is a lower bound on the variance of the estimation error, it takes a much larger $q$ to reach the inflection point compared to the simulation results. The above result suggests that an optimal $q$ can be chosen to maximize the key rate. To evaluate the BER performance, Fig. 5 plots the bit error probability between two nodes as a function of $q$. The results show that, with a fixed $T_o = 7.5$ μs, $p_e$ can be maintained at a very low level if $q < 100$. We can use Gray codes (one bit of error is introduced between adjacent sectors) to encode the quantization indices to reduce $p_e$. Also note that in these results, the coherence time is not fully exploited (i.e., the observation time $T_o = 7.5$ μs $\ll T_c$), so one can also reduce $p_e$, and thereby increase the key rate, by setting a larger $T_o$.

Fig. 6. Bit error probability $p_e$ versus observation time $T_o$.

Fig. 7. Key rate versus the number of relays $N$.

Fig. 6 plots the bit error probability $p_e$ as a function of the observation time $T_o$ under SNR = 25 dB and $q = 16$. The results show that an increase of $T_o$ is equivalent to an increase of SNR, which results in a close match between the simulation results and the CRB. Fig. 7 plots the key rate of the cooperative key generation protocol as the number of relay nodes increases when the number of quantization levels is fixed at $q = 16$. We choose $T_o = 11$ μs to maintain a high level of estimation accuracy. The results show that the key rate increases linearly as a function of $N$, which confirms our previous analysis that the gain of cooperative key generation scales with the number of relays. As a final point on the results, we note that a further increase of SNR (e.g., from 25 dB to 40 dB) does not help much to improve the performance. This is because the estimation accuracy is already high enough when a small $q$ and a reasonable value of $T_o$ are chosen.

B. Key Randomness and The Effect of Mobility 

As discussed above, the proposed cooperative key generation scheme employs the inherent randomness of uniformly distributed channel phases in multipath narrowband fading channels. We employ the widely used NIST randomness test suite to verify the randomness of the secret bits generated from our simulation [23]. To pass a test, the p-value must be greater than 0.01. In the test, we randomly select 10 bit sequences generated from our simulation and compute their p-values for 8 tests. The results in Table III show that the average entropy of our generated bit sequences is very close to that of a truly random sequence.

VII. Security Analysis 

In this section, we provide a security discussion for the proposed cooperative key generation scheme. We focus on both practical and analytical aspects. The security of the proposed key generation scheme is guaranteed based on the assumption that the adversary is not located near the legitimate parties, i.e., $A$, $B$ and the relay nodes. This is due to spatial decorrelation: since the signal decorrelates over a distance of approximately one half wavelength [11], it is almost impossible for an adversary located at a different place from the transceivers to obtain the identical channel response for key generation. That is, an entity which is at least $\lambda/2$ away from the network nodes experiences fading channels to the nodes that are statistically independent of the channels between the communicating nodes. As an example, consider a wireless system with a 900 MHz carrier frequency. If the adversary is more than 16 cm away from the communicating nodes, it experiences independent channel variations such that no useful information is revealed to it. It has been empirically shown in [10] that an eavesdropper passively observing the signals transmitted between legitimate nodes cannot obtain any significant information about the signals received at the legitimate nodes.

Another key point regarding security is that we rely on the uniformity of the channel phase for extracting secret key bits in narrowband fading channels. As discussed in Section III-B, the complex lowpass equivalent signal for $r(t)$ can be written as $r_{LP}(t) = r_I(t) + jr_Q(t)$, where the phase of $r(t)$ is $\theta = \arctan\left(\frac{r_Q(t)}{r_I(t)}\right)$. For uncorrelated Gaussian random variables $r_I(t)$ and $r_Q(t)$, it can be shown that $\theta$ is uniformly distributed over $[0, 2\pi]$ [11]. Consequently, our proposed PHY-based key generation algorithm is best suited for narrowband fading channels, where $r(t)$ has a Rayleigh-distributed amplitude and uniform phase. We have the following theorem:

Theorem 4: The cooperative key generation scheme is secure, i.e., the resulting secret key is effectively concealed from the eavesdropper observing the public information:

$$\frac{1}{N+1} I(M_0, M_1, M_2, \ldots, M_N; K_{AB}, K_{11}, K_{21}, \ldots, K_{N1}) < \epsilon$$

Proof: See Appendix C. ■

VIII. Conclusion 

In this paper, a novel cooperative key generation protocol 
was developed to facilitate high-rate key generation in narrowband
fading channels, where two keying nodes extract the
phase randomness of the fading channel with the aid of relay 
node(s). For the first time, we explicitly considered the effect 
of estimation methods on the extraction of secret key bits from 
the underlying fading channels and focused on a popular sta- 
tistical method-maximum likelihood estimation (MLE). The 
performance of the cooperative key generation scheme was ex- 
tensively evaluated theoretically. We successfully established 
both a theoretical upper bound on the maximum secret key rate 
from mutual information of correlated random sources and a 
more practical upper bound from Cramer-Rao bound (CRB) in 
estimation theory. Numerical examples and simulation studies 
were also presented to demonstrate the performance of the 
cooperative key generation system. The results show that the 
key rate can be improved by a couple of orders of magnitude 
compared to the existing approaches. 

Appendix A 
Proof of Theorem 1 

Proof: From the above discussion, it is easy to see that $\hat{R}_{ij}$ is a zero-mean Gaussian random variable with variance $\sigma_h^2 + \frac{\sigma_n^2}{\|\mathbf{S}_i\|^2}$. Similarly, $\hat{R}_{ji}$ is a zero-mean Gaussian random variable with variance $\sigma_h^2 + \frac{\sigma_n^2}{\|\mathbf{S}_j\|^2}$. Assume that nodes $i$ and $j$ transmit with power $P = a^2/2$. We have $\|\mathbf{S}_i\|^2 = \|\mathbf{S}_j\|^2 \approx PN_s$. Obviously, $(\hat{R}_{ij}, \hat{R}_{ji})$ retains all the common randomness in $(\mathbf{R}_{ij}; \mathbf{R}_{ji})$. Thus, the mutual information is

$$I(r_{ij}(t); r_{ji}(t)) = I(\hat{R}_{ij}; \hat{R}_{ji}). \qquad (16)$$

The mutual information $I(\hat{R}_{ij}; \hat{R}_{ji})$ can be computed as follows:

$$I(\hat{R}_{ij}; \hat{R}_{ji}) = H(\hat{R}_{ij}) + H(\hat{R}_{ji}) - H(\hat{R}_{ij}, \hat{R}_{ji})$$
$$= \frac{\ln 2}{2}\log_2\left(2\pi e\left(\sigma_h^2 + \frac{\sigma_n^2}{PN_s}\right)\right) + \frac{\ln 2}{2}\log_2\left(2\pi e\left(\sigma_h^2 + \frac{\sigma_n^2}{PN_s}\right)\right) - H(\hat{R}_{ij}, \hat{R}_{ji})$$
$$= \ln 2 \, \log_2\left(2\pi e\left(\sigma_h^2 + \frac{\sigma_n^2}{PN_s}\right)\right) - H(\hat{R}_{ij}, \hat{R}_{ji}). \qquad (17)$$

Obviously, $\hat{R}_{ij}$ and $\hat{R}_{ji}$ form a multivariate normal distribution, thus

$$H(\hat{R}_{ij}, \hat{R}_{ji}) = \frac{\ln 2}{2}\log_2\left[(2\pi e)^2 \det(\Sigma)\right], \qquad (18)$$

where $\Sigma$ is the covariance matrix of the vector $[\hat{R}_{ij}, \hat{R}_{ji}]$, i.e.,

$$\Sigma = \begin{bmatrix} \sigma_h^2 + \frac{\sigma_n^2}{PN_s} & \mathrm{Cov}(\hat{R}_{ij}, \hat{R}_{ji}) \\ \mathrm{Cov}(\hat{R}_{ij}, \hat{R}_{ji}) & \sigma_h^2 + \frac{\sigma_n^2}{PN_s} \end{bmatrix}. \qquad (19)$$

The covariance of $\hat{R}_{ij}$ and $\hat{R}_{ji}$ is calculated by

$$\mathrm{Cov}(\hat{R}_{ij}, \hat{R}_{ji}) = E(\hat{R}_{ij}\hat{R}_{ji}) - E[\hat{R}_{ij}]E[\hat{R}_{ji}] = E[h^2] = \sigma_h^2, \qquad (20)$$

and $\det(\Sigma)$ is the determinant of $\Sigma$, which is computed by

$$\det(\Sigma) = \left[\sigma_h^2 + \frac{\sigma_n^2}{PN_s}\right]^2 - \sigma_h^4. \qquad (21)$$

Thus, the mutual information between nodes $i$ and $j$ is

$$I(\hat{R}_{ij}; \hat{R}_{ji}) = \frac{\ln 2}{2}\log_2\left(1 + \frac{\sigma_h^4 N_s^2 P^2}{\sigma_n^4 + 2\sigma_n^2\sigma_h^2 N_s P}\right). \qquad (22)$$

Assuming the coherence time is $T_c$, the maximum key rate is

$$R^{MI} = \frac{1}{T_c} I(\hat{R}_{ij}; \hat{R}_{ji}), \qquad (23)$$

where the superscript MI in $R^{MI}$ denotes that the key rate is derived as an upper bound from mutual information. ■

Appendix B
Proof of Theorem 3

Proof: To facilitate analysis, we assume that when the number of samples increases by using a larger observation time, the estimation errors converge to zero-mean Gaussian random variables with variances $\sigma_{\hat{\theta}}^2$, which can be lower-bounded by the Cramer-Rao bounds (CRB) [?]. Fig. 8 plots both the distribution of the MLE errors using simulation and the CRB results. The simulation results show that the variance of the estimation errors, $\sigma_{MLE}^2 = 1.6877 \cdot 10^{-?}$, is lower-bounded by the CRB, $\sigma_{CRB}^2 = 1.5616 \cdot 10^{-?}$. When estimating the unknown phase of a sampled sinusoid of amplitude $a$ in white noise with a given power spectral density (PSD), the CRB for the variance of the phase estimate is given as

$$\sigma_{\hat{\theta}}^2 \ge \cdots\, \frac{f_s(2N_s - 1)}{a^2 N_s(N_s + 1)} \approx \cdots \qquad (24)$$

where $f_s$ is the sampling rate, $N_s$ is the number of samples in the observation, and $T_o$ is the observation time (i.e., beacon duration) in seconds. The approximations can be obtained by assuming that $N_s$ is large and using the fact that $N_s/f_s = T_o$.

Fig. 8. The comparison of ML estimation error distribution using simulation and CRB (legend: MLE simulation, CRB; SNR = 25 dB; horizontal axis: phase estimation error of MLE).

Considering the received signal, we assume $a_r = a\alpha$ is the received signal strength (we neglect the subscript $i, j$ for simplicity). The amplitude response of the fading channel $\alpha$ is Rayleigh distributed, and $E[\alpha^2] = 2\sigma_h^2$, so $a_r^2 = 2\sigma_h^2 a^2$. Hence, the CRB for the received signal can be expressed as a function of SNR and $N_s$:

$$\sigma_{\hat{\theta}}^2 \ge \frac{\cdots}{\mathrm{SNR}\, N_s}, \qquad (25)$$

where

$$\mathrm{SNR} = \frac{\cdots}{2N_0 f_s}. \qquad (26)$$

Suppose $[0, 2\pi]$ is divided into $q$ levels, where $q$ is a power of two. Now we analyze the probability that the estimates of nodes $i$ and $j$ fall into the same interval when performing quantization. Let $\bar{P}_{QIA}$ denote the average probability of quantization index agreement. Without loss of generality, assume that $\theta$ falls into the $i$-th sector ($i \in \{0, 1, \cdots, q-1\}$).

As the phase estimation errors are independent and Gaussian distributed according to the CRB in Eq. (25), the probability that $\hat{\theta} = \theta + \theta_e$ falls into sector $i'$ is (see Fig. 9)

$$P_{i'}(\theta) = \cdots \qquad (27)$$

where $i' \in \{0, 1, \cdots, q-1\}$ and $\theta_e$ is the estimation error.

Thus, $P_{QIA}$ can be computed as $P_{QIA}(\theta) = \sum_{i'=0}^{q-1} (P_{i'}(\theta))^2$. Note that $P_{QIA}(\theta)$ is a function of $\theta$. The value of $P_{QIA}(\theta)$ goes up when the "true" $\theta$ approaches the center of a sector and down when $\theta$ is close to the boundaries of a sector. In fact, given $\theta \in [0, 2\pi]$, $P_{QIA}(\theta)$ is symmetric about the center of a sector and changes periodically with period $2\pi/q$. Our simulation results indicate that the variance of the phase estimate is much smaller than one. Thus, given $\theta \in \left[\frac{2\pi i}{q}, \frac{2\pi(i+1)}{q}\right)$, $P_{QIA}(\theta)$ is mainly determined by $P_i(\theta)$ ($i' = i$). Based on the above analysis, we can compute the average probability of quantization index agreement as

$$\bar{P}_{QIA} = \cdots \qquad (28)$$

When the estimates of nodes $i$ and $j$ lie in the same interval, they agree on a bit vector of length $\log_2 q$; otherwise they agree on no bit. Hence, the expected key rate is

$$R^{CRB} = \frac{\bar{P}_{QIA}\log_2 q}{T_c}. \qquad (29)$$

Note that $p_e \approx 1 - \bar{P}_{QIA}$ if we assume zero bits are generated when the two nodes' estimates fall into different intervals. If Gray codes are utilized, $p_e \approx \frac{1 - \bar{P}_{QIA}}{\log_2 q}$. ■
Fig. 9. An illustration of estimation error distribution on quantization intervals.



Appendix C
Proof of Theorem 4

Proof: Assume $N$ relay nodes are involved in the key establishment. An eavesdropper $E$ monitors all the communications and tries to use this information to find the secret key. Without loss of generality, we assume the key can be established in one round. We have

$$I(R_{AB}; R_{BA}) = K_{AB} \qquad (30)$$
$$I(R_{AR_j}; R_{R_jA}) = K_{j1} \qquad (31)$$
$$I(R_{BR_j}; R_{R_jB}) = K_{j2} \qquad (32)$$

Suppose $A$ and $B$ always choose $K_{j1}$ as their key component. Let $M_0 = \{R_{AE}, R_{BE}\}$. The information $E$ could learn during the agreement of $K_{j1}$ is $M_j = \{R_{AE}, R_{BE}, R_{R_jE}, K_{j1} \oplus K_{j2}\}$. Because the channels between any two pairs of nodes are independent, for any $\epsilon > 0$ we have

$$I(R_{AE}, R_{BE}; K_{AB}) < \epsilon \qquad (33)$$
$$I(R_{AE}, R_{BE}, R_{R_jE}; K_{j1}) < \epsilon. \qquad (34)$$

After the relay node $R_j$ broadcasts $K_{j1} \oplus K_{j2}$, $E$ learns $K_{j1} \oplus K_{j2}$. However,

$$I(K_{j1} \oplus K_{j2}; K_{j1}) = 0. \qquad (35)$$

This is equivalent to a one-time-pad encryption of $K_{j1}$ with secret key $K_{j2}$. Without knowing $K_{j2}$, $E$ could learn nothing from the ciphertext $K_{j1} \oplus K_{j2}$, thus we have

$$I(M_j; K_{j1}) = I(R_{AE}, R_{BE}, R_{R_jE}; K_{j1}) + I(K_{j1} \oplus K_{j2}; K_{j1}) < \epsilon. \qquad (36)$$

The total information obtained by $E$ is the set $\{M_0, M_1, M_2, \ldots, M_N\}$, whose elements are independent of each other. On the other side, $A$ and $B$ obtain the key set $\{K_{AB}, K_{11}, K_{21}, \ldots, K_{N1}\}$, whose elements are also independent of each other. According to the independence of the random variables and the basic properties of mutual information, we have

$$I(M_0, M_1, M_2, \ldots, M_N; K_{AB}, K_{11}, K_{21}, \ldots, K_{N1}) = I(M_0; K_{AB}) + \sum_{j=1}^{N} I(M_j; K_{j1}) < (N + 1)\epsilon.$$